89 research outputs found

Deductible imputation in administrative medical claims datasets

Author: Cliff Betsy Q.
Eddelbuettel Julia C. P.
Eisenberg Matthew D.
Meiselbach Mark K.
Publication date: 19/01/2024

Objective: To validate imputation methods used to infer plan-level deductibles and determine which enrollees are in high-deductible health plans (HDHPs) in administrative claims datasets. Data sources and study setting: 2017 medical and pharmaceutical claims from OptumLabs Data Warehouse for US individuals. Study design: We impute plan deductibles using four methods: (1) parametric prediction using individual-level spending; (2) parametric prediction with imputation and plan characteristics; (3) highest plan-specific mode of individual annual deductible spending; and (4) deductible spending at the 80th percentile among individuals meeting their deductible. We compare deductibles' levels and categories for imputed versus actual deductibles. Data collection/extraction methods: Not applicable. Principal findings: All methods had a positive predictive value (PPV) for determining high- versus low-deductible plans of ≥87%; negative predictive values (NPV) were lower. The method imputing plan-specific deductible spending modes was most accurate and least computationally intensive (PPV: 95%; NPV: 91%). This method also best correlated with actual deductible levels; 69% of imputed deductibles were within $250 of the true deductible. Conclusions: In the absence of plan structure data, imputing plan-specific modes of individual annual deductible spending best correlates with true deductibles and best predicts enrollees in HDHPs.
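As a rough illustration of method (3) in the abstract above, the following sketch takes the highest mode of individual annual deductible spending within each plan. It is not the authors' implementation: the function names, the input layout of (plan, spending) pairs, the rounding to whole dollars, the exclusion of zero spending, and the 1300-dollar cutoff for a high-deductible plan are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import multimode


def impute_plan_deductibles(records):
    """Impute one deductible per plan from enrollee-level spending.

    records: iterable of (plan_id, annual_deductible_spending) pairs,
    one pair per enrollee (hypothetical input layout).
    """
    by_plan = defaultdict(list)
    for plan_id, spending in records:
        # Assumption: enrollees with no deductible spending carry no signal.
        if spending > 0:
            # Assumption: round to whole dollars so equal amounts coincide.
            by_plan[plan_id].append(round(spending))
    # multimode returns every most-frequent value; keep the highest one.
    return {plan: max(multimode(values)) for plan, values in by_plan.items()}


def is_high_deductible(deductible, threshold=1300):
    """Classify a plan; the default threshold is an illustrative assumption."""
    return deductible >= threshold


claims = [
    ("A", 500), ("A", 500), ("A", 120), ("A", 1500), ("A", 1500),
    ("B", 2000), ("B", 2000), ("B", 300),
    ("C", 0),
]
imputed = impute_plan_deductibles(claims)
```

In this toy input, plan A has two equally frequent amounts, so the higher one is kept; plan C has no positive spending and therefore receives no imputed value.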

Knowledge UChicago

Synthesizing efficacious genistein in conjugation with superparamagnetic Fe₃O₄ decorated with bio-compatible carboxymethylated chitosan against acute leukemia lymphoma

Crossref

University of Dundee Online Publications

A User-Friendly Hybrid Sparse Matrix Class in C++

Author: B Stroustrup
C Sanderson
D Eddelbuettel
D Vandevoorde
E Anderson
IS Duff
L Rosen
R Curtin
RB Lehoucq
TH Cormen
XS Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

When implementing functionality which requires sparse matrices, there are numerous storage formats to choose from, each with advantages and disadvantages. To achieve good performance, several formats may need to be used in one program, requiring explicit selection and conversion between the formats. This can be both tedious and error-prone, especially for non-expert users. Motivated by this issue, we present a user-friendly sparse matrix class for the C++ language, with a high-level application programming interface deliberately similar to the widely used MATLAB language. The class internally uses two main approaches to achieve efficient execution: (i) a hybrid storage framework, which automatically and seamlessly switches between three underlying storage formats (compressed sparse column, coordinate list, Red-Black tree) depending on which format is best suited for specific operations, and (ii) template-based meta-programming to automatically detect and optimise execution of common expression patterns. To facilitate relatively quick conversion of research code into production environments, the class and its associated functions provide a suite of essential sparse linear algebra functionality (e.g., arithmetic operations, submatrix manipulation) as well as high-level functions for sparse eigendecompositions and linear equation solvers. The latter are achieved by providing easy-to-use abstractions of the low-level ARPACK and SuperLU libraries. The source code is open and provided under the permissive Apache 2.0 license, allowing unencumbered use in commercial products.
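To make the format conversion mentioned in the abstract above concrete, here is a minimal sketch of converting a coordinate list into compressed sparse column storage. It is a generic textbook illustration written in Python, not the API or internals of the C++ class described; the function name `coo_to_csc` and the triplet layout are assumptions for the example.

```python
def coo_to_csc(n_cols, entries):
    """Convert coordinate-list entries to compressed sparse column arrays.

    entries: list of (row, col, value) triplets, assumed to contain no
    duplicate (row, col) positions.
    Returns (values, row_indices, col_ptrs), where column c occupies
    positions col_ptrs[c] up to (not including) col_ptrs[c + 1].
    """
    # CSC stores entries column by column, rows ascending within a column.
    ordered = sorted(entries, key=lambda e: (e[1], e[0]))
    values = [v for _, _, v in ordered]
    row_indices = [r for r, _, _ in ordered]

    # Count the entries in each column, then take a running sum.
    col_ptrs = [0] * (n_cols + 1)
    for _, c, _ in ordered:
        col_ptrs[c + 1] += 1
    for c in range(n_cols):
        col_ptrs[c + 1] += col_ptrs[c]
    return values, row_indices, col_ptrs


triplets = [(2, 0, 5.0), (0, 1, 3.0), (1, 1, 4.0), (0, 0, 1.0)]
values, row_indices, col_ptrs = coo_to_csc(3, triplets)
```

The coordinate list is cheap to append to, while the column-compressed arrays support fast column access, which is the kind of trade-off that makes switching between formats attractive.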

arXiv.org e-Print Archive

Crossref

University of Queensland eSpace

bcROCsurface: an R package for correcting verification bias in estimation of the ROC surface and its volume for continuous diagnostic tests

Author: CT Nakas
D Eddelbuettel
J Luo
K To Duc
Khanh To Duc
N Novoselova
T Yu
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Bayesian spatial extreme value analysis of maximum temperatures in County Dublin, Ireland

Author: Climate Change Advisory Council
Cressie N.
Diggle P.
Eddelbuettel D.
IPCC
Matérn B.
Nychka D.
R Core Team
Rafferty J. P.
World Meteorological Organization
Publication date: 01/01/2019

In this study, we begin a comprehensive characterisation of temperature extremes in Ireland for the period 1981-2010. We produce return levels of anomalies of daily maximum temperature extremes for an area over Ireland, for the 30-year period 1981-2010. We employ extreme value theory (EVT) to model the data using the generalised Pareto distribution (GPD) as part of a three-level Bayesian hierarchical model. We use predictive processes in order to solve the computationally difficult problem of modelling data over a very dense spatial field. To our knowledge, this is the first study to combine predictive processes and EVT in this manner. The model is fit using Markov chain Monte Carlo (MCMC) algorithms. Posterior parameter estimates and return level surfaces are produced, in addition to specific site analysis at synoptic stations, including Casement Aerodrome and Dublin Airport. Observational data from the period 2011-2018 is included in this site analysis to determine if there is evidence of a change in the observed extremes. An increase in the frequency of extreme anomalies, but not the severity, is observed for this period. We found that the frequency of observed extreme anomalies from 2011-2018 at the Casement Aerodrome and Phoenix Park synoptic stations exceeds the upper bounds of the credible intervals from the model by 20% and 7% respectively.

arXiv.org e-Print Archive

MURAL - Maynooth University Research Archive Library

Crossref

Maynooth University ePrints and eTheses Archive

NUI Maynooth Eprint Archive

Assessing United States county-level exposure for research on tropical cyclones and human health

Author: Al-Hamdan Mohammad
Anderson Brooke G.
Crosson William
Eddelbuettel Dirk
Ferreri Joshua
Guikema Seth
Peng Roger D.
Quiring Steven
Schumacher Andrea
Yan Meilin
Publication venue: 'Environmental Health Perspectives'
Publication date: 28/10/2020

Includes bibliographical references (pages 067007-12-067007-13). Background: Tropical cyclone epidemiology can be advanced through exposure assessment methods that are comprehensive and consistent across space and time, as these facilitate multiyear, multistorm studies. Further, an understanding of patterns in and between exposure metrics that are based on specific hazards of the storm can help in designing tropical cyclone epidemiological research. Objectives: a) Provide an open-source data set for tropical cyclone exposure assessment for epidemiological research; and b) investigate patterns and agreement between county-level assessments of tropical cyclone exposure based on different storm hazards. Methods: We created an open-source data set with data at the county level on exposure to four tropical cyclone hazards: peak sustained wind, rainfall, flooding, and tornadoes. The data cover all eastern U.S. counties for all land-falling or near-land Atlantic basin storms, covering 1996–2011 for all metrics and up to 1988–2018 for specific metrics. We validated measurements against other data sources and investigated patterns and agreement among binary exposure classifications based on these metrics, as well as compared them to use of distance from the storm’s track, which has been used as a proxy for exposure in some epidemiological studies. Results: Our open-source data set was typically consistent with data from other sources, and we present and discuss areas of disagreement and other caveats. Over the study period and area, tropical cyclones typically brought different hazards to different counties. Therefore, when comparing exposure assessment between different hazard-specific metrics, agreement was usually low, as it also was when comparing exposure assessment based on a distance-based proxy measurement and any of the hazard-specific metrics.
Discussion: Our results provide a multihazard data set that can be leveraged for epidemiological research on tropical cyclones, as well as insights that can inform the design and analysis for tropical cyclone epidemiological research.

Mountain Scholar (Digital Collections of Colorado and Wyoming)

rapidGSEA: Speeding up gene set enrichment analysis on multi-core CPUs and CUDA-enabled GPUs

Author: Andreas Hildebrandt
Bertil Schmidt
C Backes
Christian Hundt
D Eddelbuettel
DAF Alcantara
E Glaab
G Marsaglia
JH Hung
L Geistlinger
L Zhang
MB Eisen
Pellagatti
Publication venue: 'Springer Science and Business Media LLC'

Crossref

hsphase: an R package for pedigree reconstruction, detection of recombination events, phasing and imputation of half-sib family groups

Author: A Efros
BJ Hayes
Brian P Kinghorn
C Gondro
C Hoze
Cedric Gondro
D Eddelbuettel
D Edwards
D He
Julius HJ van der Werf
MH Ferdosi
Mohammad H Ferdosi
MPL Calus
Seung Hwan Lee
SR Browning
SY Su
T Druet
T Meuwissen
The R Development Core Team
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Interpreting transcriptional changes using causal graphs: new methods and their practical utility on public networks

Author: A Ellis
A Gutteridge
AH Bild
Alex Gutteridge
AM Binshtok
AM Slavotinek
B Amati
Ben Sidders
Carl Tony Fakhry
D Bouhassira
D Eddelbuettel
D Ilic
D Laifenfeld
Daniel Ziemek
DG Jamieson
E Cocolakis
G Rodriguez-Martinez
HZ Chen
J Pollard
K Ren
K Zarringhalam
Kourosh Zarringhalam
L Chindelevitch
LJ Jensen
MO Nandan
Parul Choudhary
Ping Chen
V Belcastro
Y Kawasaki
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Improving gene-set enrichment analysis of RNA-Seq data with small replicates

Author: A Liberzon
A Subramanian
BR Zeeberg
BS Carver
C Lee
C Trapnell
CW Law
D Eddelbuettel
D Nam
D Wu
DC Koboldt
Dongmei Li
Dougu Nam
F Rapaport
GK Smyth
H Jiang
H Li
HL Li
J Li
J Li
JC Marioni
JH Bullard
JJ Goeman
JK Pickrell
JK Schwarz
JX Feng
KA Gray
MA Dillies
MA Newton
MD Robinson
MD Young
ME Ritchie
MI Love
Q Xiong
S Anders
S Song
Seon-Young Kim
Sora Yoon
U Nagalakshmi
V Saxena
W Huang da
WT Barry
X Wang
X Wang
Z Wang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 09/11/2016

Deregulated pathways identified from transcriptome data of two sample groups have played a key role in many genomic studies. Gene-set enrichment analysis (GSEA) has been commonly used for pathway or functional analysis of microarray data, and it is also being applied to RNA-seq data. However, most RNA-seq data so far have only a small number of replicates. This forces the use of the gene-permuting GSEA method (or preranked GSEA), which results in a great number of false positives due to the inter-gene correlation in each gene-set. We demonstrate that incorporating the absolute gene statistic in one-tailed GSEA considerably improves the false-positive control and the overall discriminatory ability of the gene-permuting GSEA methods for RNA-seq data. To test the performance, a simulation method to generate correlated read counts within a gene-set was newly developed, and a dozen currently available RNA-seq enrichment analysis methods were compared, where the proposed methods outperformed others that do not account for the inter-gene correlation. Analysis of real RNA-seq data also supported the proposed methods in terms of false positive control, ranks of true positives and biological relevance. An efficient R package (AbsFilterGSEA) coded with C++ (Rcpp) is available from CRAN.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

ScholarWorks@UNIST

FigShare